Job Description
- Lead the architecture, design, and development of components and services to enable Machine Learning at scale.
- Be responsible for data ingestion from disparate SAP and non-SAP sources (e.g., iEnergy, MDMS), and for maintaining and extending the big data platform infrastructure that supports the client's business use cases.
- Identify and recommend the most appropriate paradigms and technology choices for batch and real-time scenarios.
- Design cluster-level solutions for our complex systems and develop enterprise-level applications, followed by unit testing.
- Build pipelines from source to SFTP, and from SFTP to the Hadoop landing layer, using Talend.
- Develop an automated data ingestion framework using Talend to synchronize Hadoop data with SAP HANA and vice versa.
- Write and run complex queries involving bucketing, partitioning, joins, and sub-queries.
- Write advanced big data business application code in both functional and object-oriented styles.
- Implement complex transformations and actions using DataFrames and Datasets in Spark/Scala.
- Develop standalone Spark/Scala applications that read error logs from multiple upstream data sources and run validations on them.
- Write build scripts using tools such as Apache Maven, Ant, and sbt, and deploy code via Jenkins for CI/CD.
- Write complex workflow jobs using Redwood and set up a multi-program scheduling system to manage Hadoop, Hive, Sqoop, and Spark jobs.
- Closely monitor pipeline jobs, troubleshoot failed jobs, and set up new property configurations within Redwood SC.
- Develop Kafka producers and consumers that handle multiple streams of data within a specified duration.
- Teach and mentor other engineers on the team.
- Document the functional and technical requirements by following company defined processes and methodologies.
- Perform data cleanups and validations on streaming data using Spark, Spark Streaming, and Scala.
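As a minimal sketch of the kind of record-level validation described above, written in plain Scala without the Spark scaffolding (in a real job this would be a DataFrame/Dataset filter; the record fields and log levels here are illustrative assumptions, not the client's actual schema):

```scala
// Sketch of record-level validation, as a Spark job might apply per batch.
// Field names (source, level, message) are assumed for illustration.
object ErrorLogValidator {
  case class LogRecord(source: String, level: String, message: String)

  // Assumed set of recognized severity levels.
  val knownLevels: Set[String] = Set("ERROR", "WARN", "FATAL")

  // A record is valid when every field is non-empty and the level is known.
  def isValid(r: LogRecord): Boolean =
    r.source.nonEmpty && r.message.nonEmpty && knownLevels.contains(r.level)

  // Split a batch into (valid, invalid) records.
  def validate(batch: List[LogRecord]): (List[LogRecord], List[LogRecord]) =
    batch.partition(isValid)

  def main(args: Array[String]): Unit = {
    val batch = List(
      LogRecord("iEnergy", "ERROR", "connection timeout"),
      LogRecord("", "ERROR", "missing source id"),
      LogRecord("MDMS", "DEBUG", "unrecognized level")
    )
    val (ok, bad) = validate(batch)
    println(s"valid=${ok.size} invalid=${bad.size}")
  }
}
```

In Spark the same predicate would typically drive a `filter` on a Dataset, routing invalid records to a quarantine path for inspection.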
Required Skills:
- A minimum of a bachelor's degree in computer science or equivalent.
- Cloudera Hadoop (CDH), Cloudera Manager, Informatica Big Data Edition (BDM), HDFS, YARN, MapReduce, Hive, Impala, Kudu, Sqoop, Spark, Kafka, HBase, Teradata Studio Express, Teradata, Tableau, Kerberos, Active Directory, Sentry, TLS/SSL, Linux/RHEL, Unix, Windows, sbt, Maven, Jenkins, Oracle, MS SQL Server, shell scripting, Eclipse IDE, Git, SVN
- Must have strong problem-solving and analytical skills.
- Must have the ability to identify complex problems and review related information to develop and evaluate options and implement solutions.
If you are interested in working in a fast-paced, challenging, fun, entrepreneurial environment and would like the opportunity to be a part of this fascinating industry, send your resume to HSTechnologies LLC, 2801 W Parker Road, Suite #5, Plano, TX 75023, or email it to hr@sbhstech.com.